- $RCSfile: C2Oberon.txt $
- Description: Hints for converting C code into Oberon
-
- Created by: fjc (Frank Copeland)
- $Revision: 1.2 $
- $Author: fjc $
- $Date: 1994/05/13 19:26:48 $
-
- Copyright © 1994, Frank Copeland.
- This file is part of Oberon-A.
- See Oberon-A.doc for conditions of use and distribution.
- ________________________________________________________________________
-
- [This document is only partly complete (there was more, but DME ate it).
- :-( It is here because what there is might still be useful. It will be
- finished for a future release. FJC]
-
- Introduction
- ------------
-
- This document contains advice for translating source code written in C
- into Oberon. It covers both simple syntax conversion and broader,
- program organisation issues.
-
- C and Oberon both belong to the same tradition of imperative, procedural
- languages that trace their origins back to Algol in the late 1950s.
- They are both general purpose languages and are sufficiently low-level
- that they can be used as system-programming languages (C is more
- consciously directed to this use). Where they mainly differ is in the
- areas of modularity and type safety; Oberon is stronger in both these
- areas.
-
- In most cases the syntaxes are similar enough to allow a straightforward
- translation from one language to the other. A text editor with a macro
- facility can help simplify this task. Difficulties arise when
- encountering constructs common in C which are missing in Oberon, such as
- union types and unsigned integers. Other, more subtle difficulties may
- arise as a result of C's less strict (often non-existent) type checking.
- Finally, C's approach to modular programming is completely different to
- Oberon's, and may require a complete restructuring of the source code.
-
-
- Simple Data Types
- -----------------
-
- Simple data types are the basic building blocks of all data structures.
- As they are determined by the underlying architecture common to almost
- all modern computers, C and Oberon have a very similar range of basic
- types. However, while the ANSI C standard specifies minimum sizes and
- ranges for all of these types, Oberon leaves much more up to the
- implementor. The equivalences given are those for Amiga C compilers and
- the Oberon-A compiler.
-
- The following table helps to explain the equivalences:
-
-   ANSI C          Min. Size   Oberon-A
-   ------          ---------   --------
-
-   char            1 byte      When used to hold ASCII values, the
-                               equivalent is CHAR; when used as a small
-                               integer, the equivalent is SHORTINT.
-   unsigned char   1 byte      When used to hold ASCII values, the
-                               equivalent is CHAR; when used as a small
-                               integer, the equivalent is BYTE. For
-                               BYTE see below.
-   short           2 bytes     INTEGER.
-   int             2 bytes     INTEGER or LONGINT. Some C compilers
-                               have 16-bit ints, others have 32-bit
-                               ints. For 16-bit ints, use INTEGER,
-                               otherwise use LONGINT.
-   unsigned int    2 bytes     No equivalent. See below.
-   long            4 bytes     LONGINT.
-   unsigned long   4 bytes     No equivalent. See below.
-   float           6 digits    REAL.
-   double          10 digits   LONGREAL.
-   long double     10 digits   LONGREAL.
-
- The BYTE type is strictly limited in Oberon (it is removed in Oberon-2).
- It may be used to represent an unsigned byte-sized value (0-255), but
- the only operation allowed is assignment. To perform arithmetic on BYTE
- values, you must first assign them to INTEGER variables.
-
- Oberon does not allow for unsigned integers apart from the limited BYTE
- type. In many cases a signed integer can simply be used instead,
- although it may be necessary to use a larger type (INTEGER instead of
- SHORTINT, for instance).
-
- The main problem occurs when you must specify a type with the same
- size as the unsigned type; this often occurs when declaring Oberon
- equivalents for the Amiga system data structures. You cannot use a
- larger type in this case; the system WILL crash if you do. Often the
- operating system defines unsigned constants to be placed in these
- variables with values that are larger than the maximum for a signed
- integer of that size. These must be converted to negative values using
- this formula:
-
- <constant> - (<max value of unsigned type> + 1).
-
- For example:
-
- An unsigned int constant 0xFFFE or 65534 converts to:
-
- (65534 - (65535 + 1)) = -2.
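-
- The formula can be checked mechanically. The following C sketch is not
- part of Oberon-A; the function name to_signed is made up for
- illustration, and it assumes a two's-complement target such as the
- Amiga. It handles 8-bit and 16-bit constants only, so that the
- arithmetic fits in a long on any ANSI C compiler:
-
```c
#include <assert.h>

/* Hypothetical helper: reinterpret an unsigned constant of the given
 * width (in bits) as the signed value with the same bit pattern,
 * using the formula  <constant> - (<max unsigned value> + 1). */
long to_signed(unsigned long value, int bits)
{
    unsigned long max_unsigned;

    assert(bits == 8 || bits == 16);     /* widths handled by this sketch */
    max_unsigned = (1UL << bits) - 1UL;  /* 255 or 65535 */
    assert(value <= max_unsigned);

    if (value > max_unsigned / 2UL)      /* too big for the signed type */
        return (long)value - (long)max_unsigned - 1L;
    return (long)value;                  /* already representable */
}
```
-
- For example, to_signed (0xFFFE, 16) gives -2, matching the worked
- example above, and to_signed (0xFF, 8) gives -1.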
-
- Often unsigned integers are used as bit fields, which are directly
- equivalent to the Oberon SET type. An unsigned char bit field is
- equivalent to the SYSTEM.BYTESET type; an unsigned int is equivalent to
- the SYSTEM.WORDSET type; an unsigned long corresponds to the SET type.
- If the exact size of the variable doesn't matter, use the SET type; the
- other two types are unique to the Oberon-A implementation and are not
- portable.
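-
- When reading C code that manipulates such bit fields, it helps to
- recognise the idioms that correspond to Oberon's set operations. The C
- sketch below is illustrative only; the macro and function names are
- invented here. Each function is annotated with the standard Oberon
- operation it corresponds to:
-
```c
#include <assert.h>

#define BIT(n) (1UL << (n))            /* Oberon: {n} */

/* Oberon: INCL (s, n) */
unsigned long set_incl(unsigned long s, int n)
{
    assert(n >= 0 && n < 32);
    return s | BIT(n);
}

/* Oberon: EXCL (s, n) */
unsigned long set_excl(unsigned long s, int n)
{
    assert(n >= 0 && n < 32);
    return s & ~BIT(n);
}

/* Oberon: n IN s */
int set_in(unsigned long s, int n)
{
    assert(n >= 0 && n < 32);
    return (s & BIT(n)) != 0;
}
```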
-
- C has no direct equivalent to Oberon's BOOLEAN type, but ints are often
- used for the same purpose. When they are, a value of zero equates to
- FALSE and any other value equates to TRUE.
-
- Oberon has no equivalent to C's enumerated data types, but they can be
- simulated very easily using integer constants. For instance:
-
- enum days { Sun,Mon,Tues,Wed,Thurs,Fri,Sat };
-
- becomes:
-
- CONST
- Sun = 0; Mon = 1; Tues = 2; Wed = 3;
- Thurs = 4; Fri = 5; Sat = 6;
-
-
- Constant values
- ---------------
-
- Character literals in C are defined using single quotes: 'a', 'Z', etc.
- String literals use double quotes: "This is a string". Oberon uses
- double quotes for both. This can create an unexpected difficulty.
- Oberon has no way of distinguishing a string literal with a single
- character in it from a character literal except from the context in
- which it is used. If the compiler does not contain special code to deal
- with the anomaly, perfectly legal code may generate errors. (Oberon-A
- now handles this anomaly).
-
- String and character literals in C may contain escape sequences such as
- '\n' and "\x5c". Oberon does not recognise escape sequences (but
- Oberon-A does, as an extension; see OC.doc.) Character literals in
- Oberon may be expressed as hexadecimal ASCII codes instead: '\n' becomes
- 0AX.
-
- In C, any numeric literal starting with "0x" or "0X" is a hexadecimal
- value. Upper and lower case characters can be used for the hexadecimal
- digits. Oberon uses an "H" character at the end of the literal to
- indicate a hexadecimal number and hex digits must be in upper case. A
- number must start with a numeric character, the same as in C. Example:
- "0xa5d3" becomes "0A5D3H". Any integer constant in C starting with "0"
- (except for hex constants, obviously) is interpreted as an octal number.
- Oberon has no equivalent for this and the constant must be converted
- into its decimal or hexadecimal equivalent.
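-
- Converting many literals by hand is error-prone, so a small helper can
- be useful. The C sketch below is an illustration only (the name
- oberon_hex is invented here). It prints a value using the rules
- described above: upper-case digits, a trailing "H", and a leading "0"
- when the first digit would otherwise be a letter. Because the C
- compiler has already parsed the literal, it works for octal constants
- too:
-
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write `value` into `buf` as an Oberon
 * hexadecimal literal. */
void oberon_hex(unsigned long value, char *buf, size_t size)
{
    /* two hex digits per 8-bit byte, plus the terminating NUL */
    char digits[2 * sizeof(unsigned long) + 1];

    sprintf(digits, "%lX", value);        /* upper-case hex digits */
    assert(strlen(digits) + 3 <= size);   /* room for "0", "H", NUL */

    if (digits[0] >= 'A' && digits[0] <= 'F')
        sprintf(buf, "0%sH", digits);     /* must start with 0-9 */
    else
        sprintf(buf, "%sH", digits);
}
```
-
- oberon_hex (0xa5d3, buf, sizeof buf) produces 0A5D3H; passing the C
- octal constant 0755 produces 1EDH.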
-
- A "U" at the end of an integer constant indicates an unsigned number.
- Oberon has no equivalent for this (see above). An "L" at the end of an
- integer constant indicates that it has a long int type. This is
- unnecessary in Oberon, which handles all the necessary type conversions
- automagically.
-
- Floating point constants are generally defined the same way in C and
- Oberon, except that an exponent must be indicated with an uppercase "E"
- in Oberon. C constants usually are of type double; Oberon defaults to
- type REAL. To specify a LONGREAL constant, Oberon uses a "D" in the
- exponent instead of an "E". C uses an "L" at the end of a floating
- point constant to indicate it is a long double type; remove this when
- converting to Oberon.
-
- Bit field constants in C are usually defined in one of two ways. One is
- to use the form: "(1<<bit)" or "(1L<<bit)". This is equivalent to
- Oberon's: "{bit}". The other is to use a hex literal, such as:
- "0xC801". The only way out here is to determine which bits are set in
- the binary representation of the number, remembering that bits are
- numbered starting at 0. In this case the equivalent is: "{0,11,14,15}".
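-
- Working out the set bits by hand is tedious for larger masks. As a
- rough aid, the C sketch below (the function name oberon_set is made up
- for this example) lists the set bits of a mask of up to 32 bits in the
- form of an Oberon set literal:
-
```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write the Oberon set literal for the low 32
 * bits of `mask` into `buf`, numbering bits from 0. */
void oberon_set(unsigned long mask, char *buf, size_t size)
{
    size_t used = 0;
    int bit;
    int first = 1;

    assert(size >= 128);   /* the longest literal needs 88 characters */
    buf[used++] = '{';
    for (bit = 0; bit < 32; bit++) {
        if (mask & (1UL << bit)) {
            if (!first)
                buf[used++] = ',';
            used += (size_t)sprintf(buf + used, "%d", bit);
            first = 0;
        }
    }
    buf[used++] = '}';
    buf[used] = '\0';
}
```
-
- oberon_set (0xC801, buf, sizeof buf) produces {0,11,14,15}, the same
- result as the hand conversion above.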
-
- When used as a boolean, an integer value of 0 is the same as FALSE,
- while any other value is equivalent to TRUE.
-
- Named constants in C are created using the #define directive. In Oberon
- they are defined in a constant declaration block, which starts with the
- CONST keyword. Macro definitions are visible from the point of
- definition until they are undefined or until the end of the file. Oberon
- constants are visible only in the scope in which they are declared,
- which may be inside a procedure.
-
-
- Operations
- ----------
-
- C has a somewhat richer set of operators than Oberon. Most of the
- missing operations are provided as standard procedures in the Oberon
- language or by the SYSTEM module supplied with Oberon-A. Oberon uses a
- much simpler system of operator precedence derived from Pascal. In
- Oberon you may need to make more use of parentheses to indicate the
- exact order of operations you want. See the Oberon Report for details
- of Oberon's operator precedence rules.
-
- The following table lists the various operations and their equivalents
- in C and Oberon. n indicates a numeric value, i indicates an integer
- value, r indicates a floating value, b indicates a boolean value, s
- indicates a bit field or set value, v indicates a variable, a and b
- indicate I have run out of ideas:
-
-   Operator            C                   Oberon
-   ------------------  ------------------  -----------------------
-
-   unary minus         n1 = -n2            n1 := -n2
-   unary plus          n1 = +n2            n1 := +n2
-   logical not         !b                  ~b
-   bitwise complement  s1 = ~s2            s1 := -s2
-   address             &v                  SYSTEM.ADR (v)
-   pointer reference   v1 = *v2            v1 := v2^
-   size of             i = sizeof(type)    i := SIZE (type)
-                       i = sizeof(v)       i := SIZE (type of v)
-                       i = sizeof(expr)    no equivalent
-   increment           ++i or i++          INC (i)
-   decrement           --i or i--          DEC (i)
-   multiplication      n1 = n2 * n3        n1 := n2 * n3
-   integer division    i1 = i2 / i3        i1 := i2 DIV i3
-   floating division   r1 = n2 / n3        r1 := n2 / n3
-   modulus             i1 = i2 % i3        i1 := i2 MOD i3
-   addition            n1 = n2 + n3        n1 := n2 + n3
-   subtraction         n1 = n2 - n3        n1 := n2 - n3
-   shift left          i1 = i2 << i3       i1 := ASH (i2, i3), or
-                                           i1 := SYSTEM.LSH (i2, i3)
-   shift right         i1 = i2 >> i3       i1 := ASH (i2, -i3), or
-                                           i1 := SYSTEM.LSH (i2, -i3)
-   greater than        a > b               a > b
-   greater or equal    a >= b              a >= b
-   less than           a < b               a < b
-   less or equal       a <= b              a <= b
-   equal               a == b              a = b
-   not equal           a != b              a # b
-   bitwise AND         i1 = i2 & i3        i1 := SYSTEM.AND (i2, i3)
-                       s1 = s2 & s3        s1 := s2 * s3
-                       s1 = s2 & ~s3       s1 := s2 - s3
-   bitwise OR          i1 = i2 | i3        i1 := SYSTEM.LOR (i2, i3)
-                       s1 = s2 | s3        s1 := s2 + s3
-   bitwise XOR         i1 = i2 ^ i3        i1 := SYSTEM.XOR (i2, i3)
-                       s1 = s2 ^ s3        s1 := s2 / s3
-   logical AND         b1 && b2            b1 & b2
-   logical OR          b1 || b2            b1 OR b2
-   assignment          a = b               a := b
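-
- One caution on the DIV and MOD rows: the Oberon Report defines them so
- that, for a positive divisor, the remainder is never negative, whereas
- C's / and % may truncate towards zero (ANSI C leaves the choice to the
- implementation when an operand is negative). The results can therefore
- differ for negative operands. The C sketch below, with invented
- function names, computes the values the Report specifies; whether a
- particular compiler such as Oberon-A follows the Report here should be
- checked against its documentation:
-
```c
#include <assert.h>

/* Quotient rounded towards minus infinity, as the Oberon Report
 * defines DIV for a positive divisor. */
long floor_div(long x, long y)
{
    long q;

    assert(y > 0);
    q = x / y;
    if (x % y < 0)      /* C truncated towards zero: step down by one */
        q--;
    return q;
}

/* Remainder in the range 0 .. y-1, as the Report defines MOD. */
long floor_mod(long x, long y)
{
    long r;

    assert(y > 0);
    r = x % y;
    if (r < 0)          /* C gave a negative remainder: shift it up */
        r += y;
    return r;
}
```
-
- floor_div (-5, 3) is -2 and floor_mod (-5, 3) is 1, where a truncating
- C compiler gives -1 and -2 for -5 / 3 and -5 % 3.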
-
- In C, assignment (=) is an operator that may be used in an expression.
- In Oberon, assignment (:=) is a statement. This means that in C you can
- say:
-
- if (c = getchar()) ...
-
- while in Oberon you must use:
-
- c := getchar ();
- IF c # 0X THEN ...
-
- You may have to pay close attention to the C increment and decrement
- operators. If the operator is placed before the variable, the variable
- is updated first and the expression uses its new value; if after, the
- expression uses the old value and the variable is updated afterwards.
- This determines whether the Oberon INC or DEC procedure should be
- placed before or after the statement that uses the variable.
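-
- A useful intermediate step is to rewrite the C code so that every
- increment or decrement is a statement of its own; the result then maps
- line for line onto Oberon. The C sketch below uses invented function
- names and shows the two placements for an array store:
-
```c
#include <assert.h>

/* Postfix form.  Same effect as the C statement  a[(*i)++] = x;  */
void store_then_inc(int a[], int *i, int x)
{
    assert(a != 0 && i != 0);
    a[*i] = x;          /* Oberon: a[i] := x; */
    *i = *i + 1;        /*         INC (i)    */
}

/* Prefix form.  Same effect as the C statement  a[++(*i)] = x;  */
void inc_then_store(int a[], int *i, int x)
{
    assert(a != 0 && i != 0);
    *i = *i + 1;        /* Oberon: INC (i);   */
    a[*i] = x;          /*         a[i] := x  */
}
```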
-
- Oberon and C both use short-circuit evaluation when processing boolean
- expressions involving logical AND and logical OR operators. This means
- that the whole expression may not be evaluated if its result can be
- determined before the end. For example, in the expression (A & B), if A
- evaluates to FALSE, B is never evaluated because it is unnecessary; the
- expression result will be FALSE regardless of the value of B. When
- translating such expressions, pay close attention to the operator
- precedence rules to make sure that the Oberon expression has the same
- logic as the C expression. In particular, Oberon's relational operators
- have lower precedence than & and OR, so the C test (a < b && c < d)
- must be written (a < b) & (c < d).
-
- In C you can replace any expression of the form:
-
- A = A <op> B
-
- where <op> is a binary operator, with:
-
- A <op>= B.
-
- In Oberon, you must convert such an expression back into its first form.
- For example:
-
- (C) A *= B => A := A * B (Oberon).
-
-
- Block Statement
- ---------------
-
- C has the concept of a block statement, which is a sequence of
- statements enclosed in braces that may be used anywhere in place of a
- single statement. It takes the form:
-
- { <statement>; <statement>; ... <statement>; }
-
- The closest equivalent in Oberon is the main body of a procedure or
- module, which is a sequence of statements bracketed by BEGIN and END:
-
- BEGIN <statement>; <statement>; ... <statement> END
-
- C also makes use of block statements in if, for and while statements.
- Oberon has its own syntax for these statements; see below.
-
- C block statements can also include variable declarations whose scope is
- limited to the block statement. In Oberon, such a block must be
- redefined as a local procedure and given a name which is used in place
- of the block statement. See Subroutines below.
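-
- It can help to do this restructuring in C first, before translating.
- The C sketch below is a made-up example: the body of a nested block
- with its own variable tmp has been lifted out into a named function
- (swap), which is the shape a local procedure would take in Oberon:
-
```c
#include <assert.h>

/* Was a nested block inside sort2:
 *     { int tmp = *lo; *lo = *hi; *hi = tmp; }
 * Now a named routine with tmp as its local variable. */
static void swap(int *a, int *b)
{
    int tmp;

    assert(a != 0 && b != 0);
    tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Put two values into ascending order. */
void sort2(int *lo, int *hi)
{
    if (*lo > *hi)
        swap(lo, hi);   /* the block statement is replaced by a call */
}
```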
-
- Note also that in Oberon semicolons are statement _separators_ while in
- C they are _terminators_. In C, a statement must end with a semicolon;
- in Oberon, a semicolon is only necessary if another statement follows
- it.
-
-
- Conditional Statements
- ----------------------
-
- C and Oberon both have the if/then/else and case conditional statements.
- The different syntaxes for if/then/else are:
-
-   ANSI C                  Oberon
-
-   if (<expr>)             IF <bool expr> THEN
-     <statement>;            <statement> {; <statement>}
-   [else                   [ELSE
-     <statement>;]           <statement> {; <statement>}]
-                           END
-
- In C, <expr> does not have to be a boolean expression; it only needs to
- evaluate to a zero or non-zero value. Zero is treated as FALSE,
- non-zero is TRUE. When translating to Oberon, the expression must be
- converted to a boolean expression. For example:
-
- if (v) ... (C) => IF v # 0 THEN ... (Oberon)
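-
- The comparison to add depends on the type of the expression being
- tested. The C sketch below is a hypothetical example (the function
- should_print is invented here) in which each implicit test has been
- made explicit; the comments show the original C form and, roughly, the
- Oberon condition that results:
-
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: decide whether a label should be printed. */
int should_print(const char *label, int count)
{
    assert(count >= 0);       /* a negative count makes no sense here */

    if (label == NULL)        /* was: if (!label)     label = NIL   */
        return 0;
    if (label[0] == '\0')     /* was: if (!label[0])  label[0] = 0X */
        return 0;
    if (count != 0)           /* was: if (count)      count # 0     */
        return 1;
    return 0;
}
```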
-
- In C, the <statement> may be a block statement, in which case the
- semicolon is unnecessary. In Oberon, the block statement is replaced by
- a sequence of statements, separated by semicolons. In both languages
- the else part is optional. In Oberon the END is mandatory.
-
- In C you may see something like:
-
- if (<expr1>)
- <statement>;
- else if (<expr2>)
- <statement>;
-
- In Oberon, this would be expressed as:
-
- IF <expr1> THEN
- <statement>
- ELSIF <expr2> THEN
- <statement>
- END
-
- The equivalent case statements are:
-
-   ANSI C                            Oberon
-
-   switch (<expr>) {                 CASE <expr> OF
-     case <item> : <statements>        <list> : <statements> |
-     case <item> : <statements>        <list> : <statements> |
-     ...                               ...
-     case <item> : <statements>        <list> : <statements>
-     [default : <statements>]        [ELSE <statements>]
-   }                                 END
-
- In C, only one constant item is allowed per case. In Oberon each case
- may include a list of constants, including ranges of values. In C, ALL
- statements after the activated case are executed, unless a break
- statement is encountered. In Oberon, only the statements associated
- with the activated case are executed. To illustrate this, the
- following statements are equivalent:
-
-   switch (today) {                CASE today OF
-   case Mon  :                       Mon .. Fri :
-   case Tues :                         StdIO.WriteStr ("go work!")
-   case Wed  :                       |
-   case Thurs:                       Sat, Sun :
-   case Fri  :                         IF today = Sat THEN
-     puts ("go work!");                  StdIO.WriteStr ("clean the ");
-     break;                              StdIO.WriteStr ("yard and ");
-   case Sat  :                         END;
-     printf                            StdIO.WriteStr ("relax!");
-       ( "%s",                         StdIO.WriteLn ();
-         "clean the yard and ");   END;
-   case Sun  :
-     puts ("relax!");
-   }
-
- Note the use of the "|" character to separate the cases.
-
- The default part in C and the ELSE part in Oberon are both optional and
- are executed only if none of the other cases are activated. If no
- default is provided and no case is activated, C simply continues after
- the switch statement. Oberon will cause a run-time error if no cases
- are activated and there is no ELSE. To get the same behaviour as C,
- include an ELSE with an empty statement (i.e. nothing) after it.
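-
- One way to prepare a switch for translation is to rewrite it in C so
- that no case falls through into the next and the "no case matched"
- path is spelled out. The C sketch below is an illustration only; the
- function advice and its buffer-based interface are invented here. It
- restates the weekday example in that form, so that each arm
- corresponds directly to one arm of an Oberon CASE:
-
```c
#include <assert.h>
#include <string.h>

enum days { Sun, Mon, Tues, Wed, Thurs, Fri, Sat };

/* Hypothetical example: build the day's advice in `buf`. */
void advice(enum days today, char *buf, size_t size)
{
    assert(size >= 32);             /* longest message is 25 chars */
    buf[0] = '\0';

    switch (today) {
    case Mon: case Tues: case Wed: case Thurs: case Fri:
        strcat(buf, "go work!");    /* Oberon: Mon .. Fri : */
        break;
    case Sat:
        strcat(buf, "clean the yard and ");
        strcat(buf, "relax!");      /* repeated, not fallen into */
        break;
    case Sun:
        strcat(buf, "relax!");
        break;
    default:
        break;                      /* Oberon: ELSE with no statements */
    }
}
```
-
- Repeating the shared statement in the Sat arm is one choice; testing
- the value inside a combined arm, as in the Oberon column of the
- example above, is another.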
-
- Iteration
- ---------
-
- Subroutines
- -----------
-
- Data Structures
- ---------------
-
- Program Structure
- -----------------
-
- Standard Libraries
- ------------------
-
- Other Issues
- ------------
-
- Example programs
- ----------------
-
-